The discrete-time GARCH methodology, which has had such a profound influence on the
modelling of heteroscedasticity in time series, is intuitively well motivated in capturing many
'stylized facts' concerning financial series, and is now almost routinely used in a wide range of
situations, often including some where the data are not observed at equally spaced intervals of
time. However, such data are more appropriately analyzed with a continuous-time model which
preserves the essential features of the successful GARCH paradigm. One possible such extension
is the diffusion limit of Nelson, but this is problematic in that the discrete-time GARCH model
and its continuous-time diffusion limit are not statistically equivalent. As an alternative,
Klüppelberg et al. recently introduced a continuous-time version of the GARCH (the 'COGARCH'
process) which is constructed directly from a background driving Lévy process. The present paper
shows how to fit this model to irregularly spaced time series data using discrete-time GARCH
methodology, by approximating the COGARCH with an embedded sequence of discrete-time GARCH
series which converges to the continuous-time model in a strong sense (in probability, in the
Skorokhod metric), as the discrete approximating grid grows finer. This property is also especially
useful in certain other applications, such as options pricing. The way is then open to using, for
the COGARCH, similar statistical techniques to those already worked out for GARCH models.
To illustrate this, an empirical investigation using stock index data is carried out.

Keywords: COGARCH process; continuous-time GARCH process; Lévy process;
pseudo-maximum likelihood estimation; Skorokhod distance; stochastic volatility

1. Introduction 

The modelling of time series in finance, economics and other fields frequently has to 
account for heteroscedasticity in the underlying data. Popular approaches to this problem 
use the autoregressive conditional heteroscedasticity (ARCH) model of Engle [6] and its 
generalized version, the GARCH model of Bollerslev [4]. The main principle of time
series modelling using GARCH is that a 'large' innovation (or unexpected change) in a 
period increases the variance of the innovation in the following periods. This constitutes 
a feedback mechanism whereby a single univariate series of innovations determines both 
the time series and its conditional variance structure. 

The GARCH concept has had a profound influence on time series modelling. Many 
other stochastic volatility models have been proposed, but the GARCH remains one of 
the easiest to conceptualize, is well established and thoroughly studied from a theoret- 
ical point of view and has been successfully applied in many practical situations. Some 
measure of the volatility (or risk) of an asset price is crucial in a wide variety of risk 
management areas (e.g., Jorion [11], Chapter 8.2, page 186, McNeil and Frey [19]) and 
in the valuation of financial derivatives (e.g., Ritchken and Trevor [23]). 

In practice, for various reasons, including weekend and holiday effects, or in tick-
by-tick data, many financial time series are irregularly spaced and this, together with
options pricing requirements, in particular, has created a demand for continuous-time 
models. Nelson [21] suggested that GARCH models be seen as discrete approximations 
to diffusions. He showed that some standard GARCH models, when scaled in certain 
ways on an approximating grid, converge in distribution, as the grid grows finer, to 
a bivariate diffusion process, the variance rate (or volatility) of which exhibits mean
reverting behavior. Nelson's result served for some time as a justification for statistical 
inference of continuous-time models using GARCH as an approximation. 

However, in Nelson's setup, the limiting process involves two independent Brownian 
motions, one of which drives the volatility and the other the accumulated time series 
(which then becomes a stochastic integral). This runs quite counter to the philosophy of
the original GARCH paradigm, whereby a single univariate series of innovations drives 
both mean and variance equations, thus providing a feedback mechanism. It is possible 
to modify Nelson's diffusion approximation so as to obtain convergence in distribution 
to a process which is driven by a single Brownian motion; however, the limit then has 
a deterministic volatility and the GARCH features disappear (see Corradi [5]). As a 
further problematic aspect, Wang [26] showed that a GARCH model and its continuous- 
time diffusion limit are not statistically equivalent, except in the case of the deterministic 
volatility limit derived by Corradi. This means that parameter estimation and testing 
for an underlying continuous-time diffusion model with stochastic volatility cannot be 
accomplished using a GARCH approximation in discrete-time. 

Recently, Klüppelberg, Lindner and Maller [14] introduced a continuous-time version of
the GARCH model, which they dubbed the 'COGARCH'. In contrast to the approaches
of Nelson and Corradi based on limiting diffusions, [14] starts with a pure jump Lévy
process and generalizes a discrete recursion which lies at the heart of the GARCH. In this
way, the main characteristics of the original GARCH are preserved: a single univariate
process drives both the volatility process and the integrated GARCH process itself, and
the same sort of feedback mechanism is built into the continuous-time model, that is,
a large change in the Lévy process results in an increase of the volatility, as well as
simultaneously increasing or decreasing the level of the process.

In the present paper, we study approximations to a COGARCH and estimation of its 
parameters when the COGARCH is the underlying data generating process. We show 
how the COGARCH can be obtained as the limit of an embedded sequence of discrete-time
GARCH series. This demonstrates that Nelson's bivariate diffusion limit is not the
only possible limit of a sequence of GARCH models and is, perhaps, not even the most 
natural. Further, our approach suggests how statistical techniques developed for GARCH 
models can be carried over to the COGARCH, after appropriate rescaling, to match the 
discrete- and continuous-time parameter sets. This allows us, in particular, to overcome 
difficulties associated with the analysis of irregularly spaced data. To illustrate, we carry 
out an empirical investigation using ASX200 stock index data, and some simulations. The 
convergence of the discrete- to the continuous-time processes is shown to be in probability 
in the Skorokhod metric and is therefore stronger than the previously mentioned weak 
convergence results of Nelson and Corradi. 

While there are studies on discretely observed diffusions (see, e.g., [1] and [10] for recent 
references), very little has been done with jump processes in our context. But, recently, 
Kallsen and Vesenmayer [13] have obtained COGARCH as a weak limit of embedded 
GARCH series (see also [12]). Their approach is quite different to ours, proceeding by 
way of the infinitesimal generator of the bivariate Markov process representation of the 
COGARCH process. In our setup, the embedded GARCH models and the COGARCH 
model are defined on the same probability space and pathwise arguments are invoked 
when proving the convergence. There are areas of applications where the stronger con- 
vergence is essential, for example, in the pricing of American options (see [17] and their 
discussion and references). 

In related investigations, Müller [20] developed a Markov chain Monte Carlo estimation
procedure for the parameters of a COGARCH, which is applicable to irregularly spaced
data. However, it assumes quite detailed knowledge about the driving Lévy process and
is computationally very intensive, so simulations using it are currently infeasible. Haug et al.
[9] use a method of moments procedure for COGARCH parameter estimation, but this
is not easily adapted for unequally spaced series.

Our paper is organized as follows. Section 2 briefly recalls the GARCH and COGA- 
RCH models and the main convergence result, Theorem 2.1, is stated. In Section 3, an 
estimation procedure for the COGARCH parameters is proposed, applied to a financial 
data set and supported by a Monte Carlo study. In Section 4, we discuss the implications 
of our results, especially with reference to Wang's [26] far reaching observation. All proofs 
are contained in Section 5. 



2. Setup and convergence theorem 

To begin, we recall the definition of a continuous-time GARCH process, as introduced
in [14]. On a filtered probability space $(\Omega, \mathcal{F}, P, (\mathcal{F}_t)_{t \ge 0})$ satisfying the 'usual hypothesis'
(see Protter [22], page 3), we are given a background driving Lévy process $L = (L(t))_{t \ge 0}$,
that is, a real-valued, pure jump Lévy process with characteristic triplet $(\gamma, 0, \Pi)$ and
$L(0) = 0$. Thus, it has characteristic function satisfying

$$E e^{i\theta L(t)} = \exp\left( i t \gamma \theta + t \int_{\mathbb{R} \setminus \{0\}} \left( e^{i\theta x} - 1 - i\theta x \mathbf{1}_{\{|x| \le 1\}} \right) \Pi(dx) \right), \qquad t \ge 0;$$

see [2, 3] and [24] for detailed background and results concerning Lévy processes. The
Lévy measure $\Pi$ is a measure on the Borel subsets of $\mathbb{R} \setminus \{0\}$, and $\gamma$ is a constant
depending on the truncation at 0; we choose the standard truncation $\mathbf{1}_{\{|x| \le 1\}}$. The filtration
$(\mathcal{F}_t)_{t \ge 0}$ is the completed natural filtration of the Lévy process $L$. Note that no Brownian
component is present in the Lévy process; we show later how it can be included if desired.
We suppose throughout that $EL(1) = 0$ and $EL^2(1) = 1$.

Given parameters $(\beta, \eta, \varphi)$, with $\beta > 0$, $\eta > 0$, $\varphi > 0$, and a square-integrable random
variable (r.v.) $\sigma(0)$ independent of $L$, the COGARCH variance process $\sigma^2 = (\sigma^2(t))_{t \ge 0}$ is
defined as the almost surely (a.s.) unique solution of the stochastic differential equation
(SDE)

$$d\sigma^2(t) = (\beta - \eta \sigma^2(t-))\,dt + \varphi \sigma^2(t-)\,d[L, L](t), \qquad t > 0, \tag{2.1}$$

where $[L, L]$ is the bracket process (quadratic variation) of $L$ (Protter [22], page 66). We
then define the integrated COGARCH process $G = (G(t))_{t \ge 0}$ in terms of $L$ and $\sigma$ as

$$G(t) = \int_0^t \sigma(s-)\,dL(s), \qquad t \ge 0. \tag{2.2}$$

We refer to [14] and [15] for detailed properties of $G$ and $\sigma^2$.
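To make definitions (2.1) and (2.2) concrete, the following minimal sketch simulates a COGARCH path under one specific, illustrative assumption: the driver $L$ is a compound Poisson process with centred normal jumps scaled so that $EL(1) = 0$ and $EL^2(1) = 1$. In that finite-activity case the SDE can be solved jump by jump: between jumps $\sigma^2$ follows the linear ODE $d\sigma^2 = (\beta - \eta\sigma^2)\,dt$, and at a jump of size $\Delta L$ the variance is multiplied by $1 + \varphi(\Delta L)^2$ while $G$ moves by $\sigma(t-)\Delta L$. The function names `simulate_cogarch` and `observe` are hypothetical helpers, not taken from the paper.

```python
import bisect
import math
import random


def simulate_cogarch(beta, eta, phi, T, rate=1.0, sigma2_0=None, seed=0):
    """Simulate (G, sigma^2) at the jump times of a compound Poisson driver.

    Assumption (illustrative): jumps are N(0, 1/rate) and arrive at the given
    rate, so that E L(1) = 0 and E L(1)^2 = 1.
    """
    rng = random.Random(seed)
    if sigma2_0 is None:
        sigma2_0 = beta / (eta - phi)  # stationary mean; needs eta > phi
    jump_sd = 1.0 / math.sqrt(rate)
    t, G, s2 = 0.0, 0.0, sigma2_0
    times, G_path, s2_path = [t], [G], [s2]
    while True:
        wait = rng.expovariate(rate)
        if t + wait > T:
            break
        t += wait
        # Between jumps: d sigma^2 = (beta - eta * sigma^2) dt, solved exactly.
        s2_left = beta / eta + (s2 - beta / eta) * math.exp(-eta * wait)
        dL = rng.gauss(0.0, jump_sd)
        G += math.sqrt(s2_left) * dL           # dG = sigma(t-) dL, see (2.2)
        s2 = s2_left * (1.0 + phi * dL * dL)   # jump part of (2.1)
        times.append(t)
        G_path.append(G)
        s2_path.append(s2)
    return times, G_path, s2_path


def observe(times, G_path, obs_times):
    """G at arbitrary observation times (G is piecewise constant between jumps)."""
    return [G_path[bisect.bisect_right(times, t) - 1] for t in obs_times]


times, G_path, s2_path = simulate_cogarch(1.0, 0.06, 0.0425, T=200.0, seed=1)
G_obs = observe(times, G_path, [0.0, 100.0, 200.0])
```

Note that `observe` applies to $G$ only; $\sigma^2$ decays continuously between jumps, so its value at an arbitrary time would have to be recomputed from the ODE solution.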
2.1. Approximating the COGARCH 

Our aim is to define a family of discrete-time processes, $G_n = (G_n(t))_{t \ge 0}$, $n = 1, 2, \ldots$,
constructed from a GARCH(1, 1) process, which approximates the continuous-time process
$G$. This allows us to take advantage of widely used inferential and other methods
in time series modelling and econometrics for this well-established process class. After
appropriate rescaling to match the discrete- and continuous-time parameter sets, $G_n$ will
be shown to converge to $G$ in a quite strong sense.

The discretization is over a finite interval $[0, T]$, $T > 0$, and is operationalized as follows.
Take deterministic sequences $(N_n)_{n \ge 1}$ with $\lim_{n \to \infty} N_n = \infty$ and
$0 = t_0(n) < t_1(n) < \cdots < t_{N_n}(n) = T$, and, for each $n = 1, 2, \ldots$, divide $[0, T]$ into $N_n$
subintervals of length $\Delta t_i(n) := t_i(n) - t_{i-1}(n)$ for $i = 1, 2, \ldots, N_n$. Assume
$\Delta t(n) := \max_{i=1,\ldots,N_n} \Delta t_i(n) \to 0$ as $n \to \infty$ and define, for each $n = 1, 2, \ldots$, a
discrete-time process $(G_{i,n})_{i=1,\ldots,N_n}$ satisfying

$$G_{i,n} = G_{i-1,n} + \sigma_{i-1,n} \sqrt{\Delta t_i(n)}\, \varepsilon_{i,n}, \qquad i = 1, 2, \ldots, N_n, \tag{2.3}$$

where $G_{0,n} = G(0) = 0$ and the variance $\sigma^2_{i,n}$ follows the recursion

$$\sigma^2_{i,n} = \beta \Delta t_i(n) + \left( 1 + \varphi \Delta t_i(n) \varepsilon^2_{i,n} \right) e^{-\eta \Delta t_i(n)} \sigma^2_{i-1,n}, \qquad i = 1, 2, \ldots, N_n. \tag{2.4}$$

Here, the innovations $(\varepsilon_{i,n})_{i=1,\ldots,N_n}$, $n = 1, 2, \ldots$, are constructed using a 'first jump'
approximation to the Lévy process, as follows. Take a strictly positive sequence
$1 \ge m_n \downarrow 0$ of reals satisfying $\lim_{n \to \infty} \Delta t(n) \overline{\Pi}^2(m_n) = 0$, where
$\overline{\Pi}(x) = \int_{|y| > x} \Pi(dy)$ is the tail of $\Pi$. Such a sequence always exists, as
$\lim_{x \downarrow 0} x^2 \overline{\Pi}(x) = 0$ for any Lévy measure. Let
$\Delta L(t) = L(t) - L(t-)$, $t > 0$, $\Delta L(0) = 0$. Fix $n \ge 1$ and define stopping times $\tau_{i,n}$ by

$$\tau_{i,n} = \inf\{ t \in [t_{i-1}(n), t_i(n)) : |\Delta L(t)| > m_n \}, \qquad i = 1, \ldots, N_n. \tag{2.5}$$

(Throughout, an infimum over the empty set is understood as being $+\infty$.) $\tau_{i,n}$ is the
time of the first jump of $L$ in the $i$th interval whose magnitude exceeds $m_n$, if such a
jump occurs.

By the strong Markov property, $(\mathbf{1}_{\{\tau_{i,n} < \infty\}} \Delta L(\tau_{i,n}))_{i=1,\ldots,N_n}$ is for each $n = 1, 2, \ldots$ a
sequence of independent r.v.s, with distribution specified by:

$$\frac{\Pi(dx) \mathbf{1}_{\{|x| > m_n\}}}{\overline{\Pi}(m_n)} \left( 1 - e^{-\Delta t_i(n) \overline{\Pi}(m_n)} \right), \qquad x \in \mathbb{R} \setminus \{0\},\ i = 1, 2, \ldots, N_n, \tag{2.6}$$

and with mass $e^{-\Delta t_i(n) \overline{\Pi}(m_n)}$ at 0. These r.v.s have finite mean, $\nu_i(n)$, and variance,
$\xi_i(n)$, say, since $EL^2(1)$ is finite. The innovations series $(\varepsilon_{i,n})_{i=1,\ldots,N_n}$ required for (2.3)
is now defined by

$$\varepsilon_{i,n} = \frac{\mathbf{1}_{\{\tau_{i,n} < \infty\}} \Delta L(\tau_{i,n}) - \nu_i(n)}{\sqrt{\xi_i(n)}}, \qquad i = 1, 2, \ldots, N_n. \tag{2.7}$$

For each $n = 1, 2, \ldots$, the $\varepsilon_{i,n}$ are independent with $E\varepsilon_{i,n} = 0$ and $\mathrm{Var}(\varepsilon_{i,n}) = 1$. Finally,
in (2.4), we take $\sigma^2_{0,n} = \sigma^2(0)$, independent of the $\varepsilon_{i,n}$.
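The construction in (2.3) to (2.7) can be sketched in a few lines for a driver whose Lévy measure is explicit. The sketch below assumes, purely for illustration, a compound Poisson driver with standard normal jumps and rate 1 (so $EL(1) = 0$, $EL^2(1) = 1$); then $\nu_i(n) = 0$ by symmetry and $\xi_i(n)$ has a closed form in terms of the normal tail. The function names and the choices of grid and threshold `m` are hypothetical, not part of the paper.

```python
import math
import random


def compound_poisson_path(T, rate=1.0, seed=0):
    """Jump times and N(0, 1) jump sizes of a compound Poisson process on [0, T)."""
    rng = random.Random(seed)
    times, sizes, t = [], [], rng.expovariate(rate)
    while t < T:
        times.append(t)
        sizes.append(rng.gauss(0.0, 1.0))
        t += rng.expovariate(rate)
    return times, sizes


def first_jump_variance(dt, m, rate=1.0):
    """xi_i(n): variance of the law (2.6) for N(0, 1) jumps (its mean nu_i(n) is 0)."""
    p_tail = math.erfc(m / math.sqrt(2.0))             # P(|Z| > m)
    pdf = math.exp(-0.5 * m * m) / math.sqrt(2.0 * math.pi)
    second_moment = (2.0 * m * pdf + p_tail) / p_tail  # E[Z^2 | |Z| > m]
    return (1.0 - math.exp(-dt * rate * p_tail)) * second_moment


def first_jump_innovations(times, sizes, grid, m, rate=1.0):
    """eps_{i,n} of (2.7): standardized first jump exceeding m in [t_{i-1}, t_i)."""
    eps, j = [], 0
    for t_prev, t_next in zip(grid, grid[1:]):
        first, found = 0.0, False
        while j < len(times) and times[j] < t_next:
            if not found and times[j] >= t_prev and abs(sizes[j]) > m:
                first, found = sizes[j], True
            j += 1
        eps.append(first / math.sqrt(first_jump_variance(t_next - t_prev, m, rate)))
    return eps


def embedded_garch(eps, grid, beta, eta, phi, sigma2_0):
    """Run recursions (2.3) and (2.4) along the grid."""
    G, s2 = [0.0], [sigma2_0]
    for i, e in enumerate(eps, start=1):
        dt = grid[i] - grid[i - 1]
        G.append(G[-1] + math.sqrt(s2[-1] * dt) * e)                        # (2.3)
        s2.append(beta * dt + (1.0 + phi * dt * e * e) * math.exp(-eta * dt) * s2[-1])  # (2.4)
    return G, s2


times, sizes = compound_poisson_path(T=5000.0, seed=1)
grid = [0.5 * k for k in range(10001)]   # equally spaced here; any increasing grid works
eps = first_jump_innovations(times, sizes, grid, m=0.1)
G, s2 = embedded_garch(eps, grid, beta=1.0, eta=0.06, phi=0.0425, sigma2_0=1.0 / 0.0175)
```

On a long path the sample mean and variance of `eps` should be close to 0 and 1, which is a quick sanity check on the standardization in (2.7).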

Remark 2.1. Equations (2.3) and (2.4) specify a GARCH(1, 1)-type recursion in the
following sense. In the ordinary discrete-time GARCH(1, 1) series, the volatility sequence
satisfies

$$\sigma_i^2 = a + b \sigma_{i-1}^2 \varepsilon_{i-1}^2 + c \sigma_{i-1}^2 \tag{2.8}$$

for constants $a$, $b$, $c$. When the time grid is equally spaced so that $\Delta t_i(n) = \Delta t(n)$,
$i = 1, 2, \ldots, N_n$, (2.4) is equivalent to (2.8), after rescaling by $\Delta t(n)$ and a
reparametrization from $(\beta, \varphi, \eta)$ to $(a, b, c)$, and (2.3) becomes a rescaled GARCH equation for the
differenced sequence $G_{i,n} - G_{i-1,n}$. More generally, with an unequally spaced grid, if the
series are scaled as in (2.3) and (2.4), convergence to the COGARCH is obtained, as we
show next.
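As a worked illustration of the reparametrization mentioned in Remark 2.1, the following sketch expands (2.4) on an equally spaced grid. The shorthand $\Delta$, $h_i$ and $Y_i$ is introduced here only for convenience and is an assumption of this sketch; the resulting constants agree with the rescaled parameters used later in Section 3.1, up to the indexing convention for the innovation in (2.8).

```latex
% Equal spacing: \Delta := \Delta t(n). Expanding (2.4) gives
\begin{align*}
\sigma^2_{i,n} &= \beta\Delta
  + \varphi\Delta e^{-\eta\Delta}\,\sigma^2_{i-1,n}\,\varepsilon^2_{i,n}
  + e^{-\eta\Delta}\,\sigma^2_{i-1,n}.
\end{align*}
% Rescale by \Delta: let h_i := \Delta\,\sigma^2_{i,n} and, from (2.3),
% Y_i := G_{i,n} - G_{i-1,n} = \sigma_{i-1,n}\sqrt{\Delta}\,\varepsilon_{i,n},
% so that Y_i^2 = h_{i-1}\,\varepsilon^2_{i,n}. Multiplying through by \Delta:
\begin{align*}
h_i &= \beta\Delta^2 + \varphi\Delta e^{-\eta\Delta}\,Y_i^2 + e^{-\eta\Delta}\,h_{i-1},
\end{align*}
% which is of the form (2.8) with
\begin{align*}
a = \beta\Delta^2, \qquad b = \varphi\Delta e^{-\eta\Delta}, \qquad c = e^{-\eta\Delta}.
\end{align*}
```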

Embed the discrete-time processes $G_{\cdot,n}$ and $\sigma^2_{\cdot,n}$ into continuous-time versions $G_n$ and
$\sigma_n^2$ defined by

$$G_n(t) := G_{i,n} \quad \text{and} \quad \sigma_n^2(t) := \sigma^2_{i,n}, \qquad \text{when } t \in [t_{i-1}(n), t_i(n)),\ 0 \le t \le T, \tag{2.9}$$

with $G_n(0) = 0$. The processes $G_n$ and $\sigma_n^2$ are in $\mathbb{D}[0, T]$, the space of càdlàg real-valued
stochastic processes on $[0, T]$. Recall that the Skorokhod $J_1$ distance between two
$\mathbb{R}^d$-valued processes $U$, $V$, each in $\mathbb{D}^d[0, T]$ (the space of càdlàg $\mathbb{R}^d$-valued stochastic processes
on $[0, T]$), is defined by

$$\rho(U, V) = \inf_{\lambda \in \Lambda} \left\{ \sup_{0 \le t \le T} \| U_t - V_{\lambda(t)} \| + \sup_{0 \le t \le T} |\lambda(t) - t| \right\}, \tag{2.10}$$

where $\Lambda$ is the set of strictly increasing continuous functions with $\lambda(0) = 0$ and $\lambda(T) = T$
(Gihman and Skorokhod [7], page 470). We can now state our main result for this section.

Theorem 2.1. In the above setup, the Skorokhod distance between the processes $(G, \sigma^2)$
defined by (2.1) and (2.2), and the discretized, piecewise constant processes $(G_n, \sigma_n^2)_{n \ge 1}$
defined by (2.9), converges in probability to 0 as $n \to \infty$, that is,

$$\rho((G_n, \sigma_n^2), (G, \sigma^2)) \xrightarrow{P} 0 \quad \text{as } n \to \infty. \tag{2.11}$$

Consequently, we also have convergence in distribution in $\mathbb{D}[0, T] \times \mathbb{D}[0, T]$:
$(G_n, \sigma_n^2) \xrightarrow{D} (G, \sigma^2)$ as $n \to \infty$.

Remark 2.2. (i) The derivation in [14] of the COGARCH employs an auxiliary Lévy
process $X = (X(t))_{t \ge 0}$ constructed from $L$, $\eta > 0$ and $\varphi > 0$:

$$X(t) = \eta t - \sum_{0 < s \le t} \log(1 + \varphi (\Delta L(s))^2), \qquad t \ge 0. \tag{2.12}$$

$X$ is a spectrally negative Lévy process of bounded variation. (In (2.12), we have adopted
the parameterization of [9], which differs somewhat from that of [14]; the latter used
$0 < \delta < 1$ whereas we use $\eta > 0$, with $e^{-\eta} = \delta$, and used $\lambda/\delta$, for another parameter
$\lambda > 0$, whereas we use $\varphi$.) Using Itô's lemma, we can verify that the solution to (2.1) can
be written in terms of $X$ as

$$\sigma^2(t) = \left( \beta \int_0^t e^{X(s)}\,ds + \sigma^2(0) \right) e^{-X(t)}, \qquad t \ge 0. \tag{2.13}$$

This shows $\sigma^2(t)$ to be a kind of generalized Ornstein-Uhlenbeck (OU) process (cf. [18]),
parameterized by $(\beta, \eta, \varphi)$ and driven by the process $L$.

(ii) Our procedure can be generalized to include a Brownian component. Let $B =
(B(t))_{t \ge 0}$ be a standard Brownian motion and $L$ an independent, pure jump Lévy process
with finite variance, and define $L^\dagger = \varsigma B + L$, where $\varsigma > 0$. Using $L^\dagger$ in place of $L$ in (2.1)
and (2.2) introduces a diffusion component into the COGARCH. Center and scale so that
$EL^\dagger(1) = 0$ and $E(L^\dagger(1))^2 = \varsigma^2 + \int x^2\,\Pi(dx) = 1$. The convergence result of Theorem 2.1
extends to this setting if we modify the definition of the process $X$ in (2.12) to

$$X^\dagger(t) = (\eta - \varphi \varsigma^2) t - \sum_{0 < s \le t} \log(1 + \varphi (\Delta L^\dagger(s))^2), \qquad t \ge 0.$$

The term $\varphi \varsigma^2$ results from the bracket process of $B$. For a related convergence result, see
Theorem 2.2 of Szimayer and Maller [25].

(iii) Now suppose that the modified COGARCH in the previous remark is, in fact,
driven by a pure diffusion, that is, $L^\dagger = B$. The COGARCH then reduces to the process
obtained in the limit by Corradi [5] and the GARCH approximations converge to Corradi's
deterministic volatility limit. In this simplified situation, the GARCH models and
the diffusion limit are statistically equivalent, as shown by Wang [26].

(iv) We have restricted ourselves throughout to convergence on the compact interval
$[0, T]$. This is true for every $T > 0$, although the approximating processes depend on
$T$ in a non-essential way. It is not difficult to modify our setup slightly so as to get
approximating processes which converge to $(G, \sigma^2)$ uniformly on compacts (u.c.p., in the
terminology of [22], page 57) and, consequently, also in $\mathbb{D}[0, \infty) \times \mathbb{D}[0, \infty)$ [16]. We omit
the details here.
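The claim in Remark 2.2(i) that (2.13) solves (2.1) can be checked informally by treating the jumps of $L$ one at a time. The sketch below is a heuristic verification under that simplification (it is transparent when $L$ has finitely many jumps on compacts); the rigorous argument is the Itô's lemma computation referred to in the remark. The auxiliary symbol $A(t)$ is introduced here for convenience only.

```latex
% Write \sigma^2(t) = A(t)\,e^{-X(t)} with
% A(t) := \beta \int_0^t e^{X(s)}\,ds + \sigma^2(0), which is continuous in t.
%
% Between jumps of L: dX(t) = \eta\,dt and dA(t) = \beta e^{X(t)}\,dt, hence
\begin{align*}
d\sigma^2(t) &= e^{-X(t)}\,dA(t) - A(t)e^{-X(t)}\,dX(t)
              = \bigl(\beta - \eta\,\sigma^2(t)\bigr)\,dt .
\end{align*}
% At a jump time t of L: X(t) = X(t-) - \log\bigl(1 + \varphi(\Delta L(t))^2\bigr), so
\begin{align*}
\sigma^2(t) &= A(t)\,e^{-X(t-)}\bigl(1 + \varphi(\Delta L(t))^2\bigr)
             = \sigma^2(t-)\bigl(1 + \varphi(\Delta L(t))^2\bigr), \\
\Delta\sigma^2(t) &= \varphi\,\sigma^2(t-)\,(\Delta L(t))^2
                   = \varphi\,\sigma^2(t-)\,\Delta[L,L](t),
\end{align*}
% using [L,L](t) = \sum_{0 < s \le t} (\Delta L(s))^2 for a pure jump L.
% Combining the two pieces reproduces (2.1).
```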

Theorem 2.1 is proved in Section 5. Next, we illustrate how to use the convergence 
result to analyze irregularly spaced time series data. 

3. GARCH analysis of irregularly spaced data 

In this section, we apply the insights gained by our discrete approximation of the 
continuous-time GARCH process to suggest a method of fitting the model to unequally 
spaced time series data. We build on the well-understood methodology developed for
the discrete-time GARCH. 

Suppose we have observations $G(t_i)$, $0 = t_0 < t_1 < \cdots < t_N = T$, on the integrated
COGARCH as defined and parameterized in (2.1) and (2.2), assumed to be in its
stationary regime. The $\{t_i\}$ are assumed fixed (non-random) time points. Let $Y_i =
G(t_i) - G(t_{i-1})$ denote the observed returns and let $\Delta t_i := t_i - t_{i-1}$. From (2.2), we
can then write

$$Y_i = \int_{t_{i-1}}^{t_i} \sigma(s-)\,dL(s), \tag{3.1}$$

where $L$ is a Lévy process with $EL(1) = 0$ and $EL^2(1) = 1$ assumed.

Our aim is to use a pseudo-maximum likelihood (PML) method to estimate the parameters
$(\beta, \eta, \varphi)$ from the observed $Y_1, Y_2, \ldots, Y_N$. To derive the pseudo-likelihood function,
observe that, because $\sigma$ is Markovian ([14], Theorem 3.2), $Y_i$ is conditionally independent
of $Y_{i-1}, Y_{i-2}, \ldots$, given $\mathcal{F}_{t_{i-1}}$. We have $E(Y_i \mid \mathcal{F}_{t_{i-1}}) = 0$ for the conditional expectation of
$Y_i$, and, for the conditional variance,

$$\rho_i^2 := E(Y_i^2 \mid \mathcal{F}_{t_{i-1}}) = \left( \sigma^2(t_{i-1}) - \frac{\beta}{\eta - \varphi} \right) \left( \frac{e^{(\varphi - \eta)\Delta t_i} - 1}{\varphi - \eta} \right) + \frac{\beta \Delta t_i}{\eta - \varphi}. \tag{3.2}$$

Equation (3.2) follows from the calculation in the third display on page 618 of [14]. To
ensure stationarity, we take $E\sigma^2(0) = \beta/(\eta - \varphi)$, with $\eta > \varphi$, in that formula and, in our
setting, $\int_{\mathbb{R}} y^2\,\Pi(dy) = EL^2(1) = 1$.
Applying the PML method, then, we assume the $Y_i$ are conditionally $N(0, \rho_i^2)$ and use
recursive conditioning to write a pseudo-log-likelihood function for $Y_1, Y_2, \ldots, Y_N$ as

$$\mathcal{L}_N = \mathcal{L}_N(\beta, \varphi, \eta) = -\frac{1}{2} \sum_{i=1}^N \frac{Y_i^2}{\rho_i^2} - \frac{1}{2} \sum_{i=1}^N \log(\rho_i^2) - \frac{N}{2} \log(2\pi). \tag{3.3}$$

We must substitute into (3.3) a calculable quantity for $\rho_i^2$, hence we need such for $\sigma^2(t_{i-1})$
in (3.2). For this, we discretize the continuous-time volatility process, just as was done
in Theorem 2.1. Thus, (2.4) reads, in the present notation,

$$\sigma_i^2 = \beta \Delta t_i + e^{-\eta \Delta t_i} \sigma_{i-1}^2 + \varphi e^{-\eta \Delta t_i} Y_i^2. \tag{3.4}$$

(3.4) is a GARCH-type recursion, so, after substituting $\sigma_{i-1}^2$ for $\sigma^2(t_{i-1})$ in (3.2), and the
resulting modified $\rho_i^2$ in (3.3), we can think of (3.3) as the pseudo-log-likelihood function
for fitting a GARCH model to the unequally spaced series.

The recursion in (3.4) is easily programmed and, taking as starting value for $\sigma^2(0)$ the
stationary value $\beta/(\eta - \varphi)$, we can maximize the function $\mathcal{L}_N$ to get PMLEs of $(\beta, \eta, \varphi)$.
In the next two subsections, we apply this estimation approach to a data set of returns on
the ASX200 share index and use simulation to study the properties of the estimates thus
obtained. An alternative approach to estimating the COGARCH parameters based on
the method of moments (MM) has been devised by Haug et al. [9]. Their results provide a
baseline against which we can compare our procedure via simulations. By choosing suitable values
for the simulation parameters, we are able to apply Theorem 2.1 in [9] to get moment
estimates by their method as well and thus to compare their MM estimates with our
PML estimates. However, the Haug et al. method only works for equally spaced series,
so we have to restrict to this case to make the comparison.
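The following is a minimal sketch of how the pseudo-log-likelihood (3.3) might be programmed, combining the conditional variance (3.2) with the recursion (3.4) and the stationary starting value $\beta/(\eta - \varphi)$. The function names and the convention of returning $-\infty$ outside the region $\beta > 0$, $\varphi \ge 0$, $\eta > \varphi$ are assumptions of this sketch, not details given in the paper.

```python
import math


def conditional_variance(sigma2_prev, dt, beta, eta, phi):
    """rho_i^2 from (3.2); requires eta > phi."""
    theta = eta - phi
    stationary = beta / theta
    # (e^{(phi - eta) dt} - 1) / (phi - eta), computed stably via expm1
    weight = -math.expm1(-theta * dt) / theta
    return (sigma2_prev - stationary) * weight + stationary * dt


def pseudo_log_likelihood(params, returns, dts):
    """Gaussian pseudo-log-likelihood (3.3) for returns Y_i with spacings dt_i."""
    beta, eta, phi = params
    if beta <= 0.0 or phi < 0.0 or eta <= phi:
        return -math.inf  # outside the stationarity region (sketch convention)
    sigma2 = beta / (eta - phi)  # stationary starting value for sigma^2(0)
    loglik = 0.0
    for y, dt in zip(returns, dts):
        rho2 = conditional_variance(sigma2, dt, beta, eta, phi)
        loglik -= 0.5 * (y * y / rho2 + math.log(rho2) + math.log(2.0 * math.pi))
        decay = math.exp(-eta * dt)
        sigma2 = beta * dt + decay * sigma2 + phi * decay * y * y  # recursion (3.4)
    return loglik


# Toy usage with made-up returns and irregular spacings (illustrative values only).
example_value = pseudo_log_likelihood(
    (1.0, 0.06, 0.0425), returns=[0.5, -1.2, 0.3, 2.0], dts=[1.0, 1.0, 3.0, 1.0]
)
```

The negative of this function can be handed to any derivative-free optimizer; the paper reports using a Nelder-Mead implementation for this purpose.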



3.1. Application to ASX stock index data

We used the PML method to fit a COGARCH model to a series consisting of 2529 log 
returns of the ASX200 stock index as listed on the Australian Stock Exchange taken once 
per trading day, March 1994 to March 2004. The data are shown in Figure 1. 

Because of weekends and public holidays, the data are irregularly spaced, with the
following frequencies of the inter-observation times:

$\Delta t$       1      2      3      4      5      6
frequency     1991     13    483     24     17      1

For example, $\Delta t = 3$ corresponds to a regular weekend without additional public
holidays. The data contain 2529 distinct values of the index returns, observed over a total
time interval of $T = 3653$ days, and there are six distinct values of $\Delta t_i$. Simulations
showed that instead of using equation (3.2) directly, one can use its first-order
approximation, $\rho_i^2 = \sigma^2(t_{i-1}) \Delta t_i$, without worsening the quality of the estimates.
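The first-order approximation can be read off from a Taylor expansion of the conditional variance (3.2). The sketch below uses the shorthand $\theta := \eta - \varphi > 0$, which is introduced here for convenience; it also shows that the neglected term is of order $(\Delta t_i)^2$ and vanishes when $\sigma^2(t_{i-1})$ sits at its stationary mean $\beta/\theta$.

```latex
% (3.2) with \theta := \eta - \varphi:
\begin{align*}
\rho_i^2 &= \Bigl(\sigma^2(t_{i-1}) - \frac{\beta}{\theta}\Bigr)
            \frac{1 - e^{-\theta\Delta t_i}}{\theta}
          + \frac{\beta\,\Delta t_i}{\theta}.
\end{align*}
% Expanding (1 - e^{-\theta\Delta t_i})/\theta
%   = \Delta t_i - \tfrac12\theta(\Delta t_i)^2 + O((\Delta t_i)^3):
\begin{align*}
\rho_i^2 &= \sigma^2(t_{i-1})\,\Delta t_i
          - \tfrac12\bigl(\theta\,\sigma^2(t_{i-1}) - \beta\bigr)(\Delta t_i)^2
          + O\bigl((\Delta t_i)^3\bigr).
\end{align*}
```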
The PML estimates were computed with an implementation of the Nelder-Mead
optimization algorithm in C++. To avoid getting caught in a local (rather than the global)
maximum of the pseudo-likelihood function, we used ten different starting simplices for
each data set. The Nelder-Mead procedure was stopped when an accuracy of $10^{-14}$ in
the location of the maximum of the function was reached. The approximate PMLEs were
as follows ($\hat\beta$ is multiplied by 365 to put it on an annualized basis, then the square root
is taken so as to give a volatility rather than a variance estimate; approximate standard
errors, calculated from the second derivative of $\mathcal{L}_N$, are in brackets):

$$\sqrt{365\hat\beta} = 0.0237\ (0.0027); \qquad \hat\varphi = 0.0685\ (0.0095); \qquad \hat\eta = 0.0847\ (0.0085).$$

Note that our estimates satisfy the stationarity condition $\hat\eta > \hat\varphi$.

These estimates imply a long-run volatility value of $(365\hat\beta/(\hat\eta - \hat\varphi))^{1/2} = 18.58\%$ p.a.
By comparison, the actual standard deviation of the returns was 15.54% p.a. Estimates
of the process $(\sigma^2(t))_{t \ge 0}$ at the observed time points can be calculated from

$$\hat\sigma_i^2 = \hat\beta + (1 - \hat\eta)\hat\sigma_{i-1}^2 + \hat\varphi (G(t_i) - G(t_{i-1}))^2$$

(cf. [9], equation (3.2)). Figure 2 shows the squared log returns for the first 1000
observations and, for comparison, the estimated annualized volatility.

To see how the volatility process evolves as the value of $\Delta t_i$ changes, we computed
estimates of the transformed, rescaled parameters $\omega_i := \beta (\Delta t_i)^2$, $\xi_i := \varphi e^{-\eta \Delta t_i} \Delta t_i$ and
$\kappa_i := e^{-\eta \Delta t_i}$, which correspond to the discrete GARCH(1, 1) parameterization. These are
listed in columns 1-6 of Table 1 (but, again, we annualize the $\omega$ estimates and take the
square root). Column 7 of Table 1 contains the GARCH(1, 1) estimates obtained by
treating the log returns as if they were equally spaced in time. As one would expect, treating
the data as if they were equally spaced gives estimates corresponding approximately to
a weighted averaging over the estimates in columns 1-6 of Table 1.
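As a quick arithmetic check of the rescaled parameterization, the short sketch below recomputes columns 1-6 of Table 1 from the rounded PML estimates reported above. Because the inputs are the rounded published values, small discrepancies in the fourth decimal place are to be expected; the function name `garch_parameters` is a hypothetical helper.

```python
import math


def garch_parameters(beta, eta, phi, dt):
    """Rescaled GARCH(1, 1) parameters (omega, xi, kappa) for spacing dt."""
    kappa = math.exp(-eta * dt)
    return beta * dt ** 2, phi * kappa * dt, kappa


# Rounded PML estimates as reported in the text.
SQRT_365_BETA, PHI_HAT, ETA_HAT = 0.0237, 0.0685, 0.0847
BETA_HAT = SQRT_365_BETA ** 2 / 365.0

# Columns 1-6 of Table 1: (sqrt(365 * omega), xi, kappa) for dt = 1, ..., 6.
PUBLISHED = {
    1: (0.0237, 0.0629, 0.9188),
    2: (0.0473, 0.1157, 0.8442),
    3: (0.0710, 0.1594, 0.7756),
    4: (0.0946, 0.1953, 0.7126),
    5: (0.1183, 0.2243, 0.6548),
    6: (0.1419, 0.2472, 0.6016),
}

recomputed = {}
for dt in PUBLISHED:
    omega, xi, kappa = garch_parameters(BETA_HAT, ETA_HAT, PHI_HAT, dt)
    recomputed[dt] = (math.sqrt(365.0 * omega), xi, kappa)
    print(dt, *(f"{value:.4f}" for value in recomputed[dt]))
```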

Table 1. Estimated parameters for various period lengths (columns 1-6) and GARCH estimates
treating the data as equally spaced (column 7)

$\Delta t_i$                      1        2        3        4        5        6     GARCH(1, 1)
$(365\hat\omega_i)^{1/2}$      0.0237   0.0473   0.0710   0.0946   0.1183   0.1419     0.0382
$\hat\xi_i$                    0.0629   0.1157   0.1594   0.1953   0.2243   0.2472     0.0962
$\hat\kappa_i$                 0.9188   0.8442   0.7756   0.7126   0.6548   0.6016     0.8434

Figure 1. ASX200 stock index taken once per trading day, March 1994 to March 2004.
(Plot title: 'ASX200 Index (1994 to 2004)'; horizontal axis: Time.)

Quite commonly, financial analyses treat weekends or public holidays by assuming the
data are contiguous over the missing period, thus, in effect, assuming that no information
relevant to the market is transmitted on the missing days. This is not generally the case,
of course, since, for example, trading in Australian stocks may be halted on a certain day
on the ASX, while some or many of these stocks may nevertheless be traded on other
international markets which are open at the time. While the corresponding information
flow is probably not of the same strength as for a regular day's trading, we expect there
will be some influence, and although our analysis above allows for unequally spaced time
periods, it implicitly assumes that all data carry the same weight of information. But
more generally, it can be argued that we should weight the observations in some way.

To investigate this, we extended the analysis using a function $w(\cdot)$ to weight the $\Delta t_i$,
constraining the sum of the weights to remain the same as for the original analysis, that
is,

$$\sum_{i=1}^N w(\Delta t_i) = \sum_{i=1}^N \Delta t_i = T. \tag{3.5}$$

In this setup, the function $w_0(\Delta t) := T/N$ represents an extreme case where the irregular
spacing of the data is ignored, while the function $w_1 := \mathrm{id}$ corresponds to our previous
analysis where only the irregular spacing was taken into account. Another extreme case
is to allow a separate parameter for each distinct value of $\Delta t$, rather than using the value
of $\Delta t$ itself. For our data, this means fitting five extra parameters.
Figure 2. Top: squared log returns of ASX200 for the first three years (1096 days). Bottom:
corresponding estimated annualized volatilities for the ASX index data.



Allowing the five extra parameters described gives a much better fit: the likelihood
increases from 8649.61 for the original analysis to 8723.92. However, some of the extra
parameters are very poorly determined (there is only one observation at $\Delta t = 6$, for
example), and inspection of the parameter estimates suggested fitting the 1-parameter
function $w_2(\Delta t) := \gamma \log(\Delta t) + c(\gamma)$, where $c$ is defined, depending only on $\gamma$, so that
condition (3.5) is fulfilled. Replacing all $\Delta t_i$ by $w_2(\Delta t_i)$ and repeating the PML
estimation, we find that the likelihood reduces only non-significantly from 8723.92 to 8721.00,
still indicating a much better fit of the model to the data than the original unweighted
model.
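The constant $c(\gamma)$ is pinned down by constraint (3.5): summing $w_2$ over the observations gives $\gamma \sum_i \log(\Delta t_i) + N c(\gamma) = T$. The sketch below evaluates this for the inter-observation frequencies tabulated in Section 3.1. The value of $\gamma$ used in the example is arbitrary, since the fitted value is not reported in the text, and the function names are hypothetical.

```python
import math

# Frequencies of the inter-observation times (Section 3.1): {dt: count}.
FREQ = {1: 1991, 2: 13, 3: 483, 4: 24, 5: 17, 6: 1}


def offset_c(gamma, freq=FREQ):
    """c(gamma) such that the weights w2 sum to T, i.e. condition (3.5) holds."""
    n_obs = sum(freq.values())
    total_time = sum(dt * count for dt, count in freq.items())
    sum_log = sum(count * math.log(dt) for dt, count in freq.items())
    return (total_time - gamma * sum_log) / n_obs


def w2(dt, gamma, freq=FREQ):
    """One-parameter weight function w2(dt) = gamma * log(dt) + c(gamma)."""
    return gamma * math.log(dt) + offset_c(gamma, freq)


gamma_example = 0.7  # arbitrary illustrative value, not an estimate from the paper
weights = {dt: w2(dt, gamma_example) for dt in FREQ}
weighted_total = sum(count * weights[dt] for dt, count in FREQ.items())
```

Setting $\gamma = 0$ in this parameterization recovers the constant weight $w_0 = T/N$, so $w_2$ interpolates away from the case in which the irregular spacing is ignored.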

This application is by no means intended to be a sophisticated analysis of the ASX 
data set, which is beyond the scope of this paper. We use this irregularly spaced data 
example simply to illustrate the possibilities. Our main point is that the COGARCH 
model can be fitted directly to unequally spaced data exactly as it is, without the need 
to force it into an equally spaced setup in some way. Further, an approximation via the 
common GARCH(1, 1) model is easily adapted to the irregularly spaced case. 
Table 2. Means over 1000 simulated estimates of $\beta$, $\varphi$ and $\eta$, average biases of the estimates,
their mean absolute errors (MAE) and their root mean squared errors (RMSE) around the true
value, for a time series of 5000 equally spaced observations from a COGARCH process driven
by a compound Poisson process, using the PML method and the method of moments (MM)

                 $\beta$                 $\varphi$                $\eta$
True             1.0000                  0.0425                   0.0600
              PML       MM           PML       MM            PML       MM
mean         1.2356   1.2487        0.0337   0.0448         0.0554   0.0672
bias         0.2356   0.2487       -0.0088   0.0023        -0.0046   0.0072
MAE          0.3799   0.4372        0.0099   0.0130         0.0125   0.0182
RMSE         0.5393   0.5892        0.0117   0.0146         0.0156   0.0231


3.2. Simulation study 

In this section, our PML method is applied to simulated data sets, first with regularly
spaced observations to allow a comparison with the results of [9], then with irregularly
spaced data to see how much this influences the quality of the estimates.

For the first run, we simulated 1000 COGARCH data sets in which $T = 5000$ and
observations occur at times $t = 1, 2, \ldots, 5000$. Thus, $N = 5000$, $\Delta t = 1$ and the ratio
$T/N = 1$ approximates that of $T/N = 3653/2529 = 1.44$ in the ASX data. As driving
Lévy process $L$, we chose a compound Poisson process with standard normal jump sizes
and jump rate $\lambda = 1$. For the 'true' COGARCH parameters we took $\beta = 1$, $\varphi = 0.0425$ and
$\eta = 0.06$. These values allow for the application of the method of moments to estimate the
COGARCH parameters since all conditions of Theorem 2.1 in [9] are satisfied. To each
data set, we applied the PML method to obtain estimates of $\beta$, $\varphi$ and $\eta$. In addition,
we computed moment estimates using the method of [9]. The calculations were done
in S-PLUS. Table 2 summarizes the results. It gives the mean over the 1000 simulated
parameter estimates, the average bias of the estimates, their mean absolute error (MAE)
and their root mean squared error (RMSE) around the true value, for both the PML and
MM approaches. The standard errors of simulation of the values in the table are very
small, as expected for a sample of size 1000, being less than 1%, for example, for the
parameter estimates, so we do not report them.

The RMSE is a comprehensive error metric, combining the variance of the estimator
around its true value, and its bias. Table 2 shows that our method reduces the RMSE
of the moment estimates by 8.5% for $\beta$, by 20.0% for $\varphi$ and by as much as 32.5% for $\eta$.
Note, however, that significant bias remains in all parameter estimates for both methods.
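The quoted reductions can be reproduced, up to rounding of the tabulated values, from the RMSE row of Table 2. The short sketch below also uses the decomposition RMSE$^2$ = variance + bias$^2$ to back out the standard deviation of each estimator that is implied by the table; the helper names are hypothetical.

```python
import math

# (PML, MM) entries from Table 2.
RMSE = {"beta": (0.5393, 0.5892), "phi": (0.0117, 0.0146), "eta": (0.0156, 0.0231)}
BIAS = {"beta": (0.2356, 0.2487), "phi": (-0.0088, 0.0023), "eta": (-0.0046, 0.0072)}


def rmse_reduction_percent(rmse_pml, rmse_mm):
    """Relative RMSE reduction of PML over MM, in percent."""
    return 100.0 * (rmse_mm - rmse_pml) / rmse_mm


def implied_sd(rmse, bias):
    """Standard deviation implied by RMSE^2 = variance + bias^2."""
    return math.sqrt(rmse ** 2 - bias ** 2)


reductions = {name: rmse_reduction_percent(*pair) for name, pair in RMSE.items()}
implied = {
    name: tuple(implied_sd(r, b) for r, b in zip(RMSE[name], BIAS[name]))
    for name in RMSE
}
for name in RMSE:
    print(name, f"{reductions[name]:.1f}%", implied[name])
```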

Next, we investigate how the quality of the PML estimates is affected when analyzing
irregularly, rather than equally, spaced data. We simulated 1000 COGARCH processes
according to the circumstances of the ASX data, thus, for 2529 values of $t$, occurring
with the frequencies specified in Section 3.1 and with $\beta$, $\varphi$ and $\eta$ taken close to the PML
estimates given there. Thus, we assume that we observed the COGARCH processes at
exactly those times at which we observed the ASX data, encompassing a total time
interval of $T = 3653$ days. Table 3 contains similar information as Table 2 for these
simulations; in addition, in brackets in the last row are the relative RMSEs from the
previous simulation study. These give some idea of how the quality of the estimates is
affected by decreasing the number of observations from 5000 to 2529 and using irregularly,
instead of equally, spaced data. In fact, we see that the quality of the estimates is not a
great deal worse than from a data set with twice as many equally spaced observations.

4. Discussion 

The GARCH methodology is now so well known and widely available that the model, or
some variant of it, is fitted to economic or financial data almost as a matter of routine.
One of the motivations for our present investigation, and for that of Klüppelberg et al. [14]
in initiating the continuous-time model, was its potential for applications such as the
analysis of irregularly spaced time series, and options pricing.

Nelson's research [21] suggested that his limiting diffusion process, or some variant of it, 
might be useful as an assumed data generating process in a practical situation. (Leaving 
aside, at this point, considerations of the appropriateness of the model as a description 
of the data at hand, in which returns are probably not normally distributed, processes 
may have jumps, etc.) Assuming this, then, a very natural procedure is to consider fitting 
a GARCH model to (necessarily discrete) observations on the underlying process, then 
to substitute the resulting parameter estimates into a discrete option pricing algorithm 
(such as, e.g., the method of Ritchken and Trevor [23]), with the intention that the price 
thus obtained converges to the 'true' price, as it would be obtained from the underlying 
continuous-time model, when the mesh size of the approximation tends to 0. 

Table 3. Means over 1000 simulated estimates of $\beta$, $\varphi$ and $\eta$, average biases of the estimates, 
their mean absolute errors (MAE) and their root mean squared errors (RMSE) around the true 
value, for a time series of length T = 3653 with 2529 irregularly spaced observations from a 
COGARCH process driven by a compound Poisson process, using the PML method. Last line: 
relative RMSE = RMSE/true parameter value and corresponding relative RMSEs from Table 2 
(PML method) in brackets 

             $\beta$             $\varphi$           $\eta$
True         1.5000              0.0690              0.0850
mean         1.9573              0.0516              0.0718
bias         0.4573             -0.0173             -0.0132
MAE          0.6913              0.0197              0.0202
RMSE         1.0100              0.0227              0.0242
rel. RMSE    0.6733 (0.5393)     0.3291 (0.2744)     0.2848 (0.2606)

However, this plan goes awry at the first step because the potential nexus between the 
discrete GARCH estimation and the corresponding continuous-time parameters does not 
exist in a diffusion setting. This follows from Wang's [26] result, which shows that the 
GARCH estimates cannot identify the parameters in the continuous-time model, except 
in the degenerate case of the constant volatility model of Corradi [5]. This complication 
extends beyond options pricing methodologies, of course, but we stress that application 
because the discrete- to continuous-time step is transparent and crucial there. 

In contrast, the COGARCH offers a class of models which appear as natural and ap- 
propriate analogs of the discrete GARCH models. The limit of our discrete-time GARCH 
approximating sequences is, in general, a jump process, not a diffusion and the close cor- 
respondence between the discrete- and continuous-time GARCH models makes it very 
plausible that they are statistically equivalent and will hence lead to consistent estima- 
tion. The evidence from the simulation study in Section 3.2 lends support to this con- 
jecture. Nevertheless, it remains to be established, as do other large sample properties of 
the estimators and tests suggested by our approach. We leave this for the future. 



5. Proofs 
Preliminaries 

Our pathwise construction relies on a 'first-jump' approximation to a Lévy process developed 
by Szimayer and Maller [25], which we present here in a general notation. Let 
$Z = \{Z(t) : t \ge 0\}$ be a Lévy process with characteristic triplet $(\gamma_Z, 0, \Pi_Z)$, where $\gamma_Z$ relates 
to the standard truncation of $\Pi_Z$ in $[-1, 1]$, and $Z(0) = 0$. Consider $Z$ on the compact 
interval $[0,T]$, which is divided into $N_n$ subintervals of length $\Delta t_i(n) := t_i(n) - t_{i-1}(n)$ 
for $i = 1, 2, \ldots, N_n$, where $0 = t_0(n) < t_1(n) < \cdots < t_{N_n}(n) = T$ is a deterministic 
partition of $[0,T]$ and $(N_n)_{n \ge 1}$ is a sequence of integers with $\lim_{n\to\infty} N_n = \infty$. Assume 
$\Delta t(n) := \max_{i=1,\ldots,N_n} \Delta t_i(n) \to 0$ as $n \to \infty$. Let $(m_n^Z)_{n \ge 1}$ be a positive sequence such 
that $\lim_{n\to\infty} m_n^Z = 0$ and define stopping times 

$$\tau^Z_{i,n} = \inf\{t \in [t_{i-1}(n), t_i(n)) : |\Delta Z(t)| > m_n^Z\} \quad \text{for } i = 1, \ldots, N_n, \tag{5.1}$$

where $\Delta Z(t) := Z(t) - Z(t-)$. Define the 'first jump process' $(\overline{Z}_n(t) : 0 \le t \le T)$ by 

$$\overline{Z}_n(t) = \sum_{i=1}^{N_n} \mathbf{1}_{\{\tau^Z_{i,n} \le t\}}\, \Delta Z(\tau^Z_{i,n}) + t\Bigl(\gamma_Z - \int_{m_n^Z < |z| \le 1} z\, \Pi_Z(dz)\Bigr) \quad \text{for } 0 \le t \le T. \tag{5.2}$$

The next proposition shows that, provided $\Delta t(n)$ and $m_n^Z$ converge to 0 at appropriate 
rates, the processes $\overline{Z}_n$ converge in probability to $Z$, uniformly for $t \in [0,T]$, as $n \to \infty$. 
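
The selection rule in (5.1) and the jump part of (5.2) can be illustrated on a path with finitely many jumps. The following is a minimal Python sketch under stated assumptions: the helper names `first_jumps` and `first_jump_sum`, the jump times and sizes, and the two grids are all hypothetical, and the drift term of (5.2) is omitted so that only the sum over first jumps is shown.

```python
def first_jumps(jump_times, jump_sizes, grid, m):
    """For each subinterval [grid[i-1], grid[i]) return (tau_i, dZ(tau_i)) for the
    first jump whose absolute size exceeds m, or None if there is no such jump
    (corresponding to tau_i = infinity). jump_times must be sorted increasingly."""
    out = []
    for left, right in zip(grid[:-1], grid[1:]):
        hit = None
        for s, x in zip(jump_times, jump_sizes):
            if left <= s < right and abs(x) > m:
                hit = (s, x)
                break
        out.append(hit)
    return out

def first_jump_sum(first, t):
    """Jump part of (5.2): sum of dZ(tau_i) over those i with tau_i <= t."""
    return sum(hit[1] for hit in first if hit is not None and hit[0] <= t)

# Hypothetical pure-jump path on [0, 1] (illustrative values only).
times = [0.10, 0.15, 0.40, 0.95]
sizes = [0.5, -0.2, 0.01, 1.0]

# Coarse grid and large threshold: the jumps at 0.15 and 0.40 are lost.
coarse = first_jumps(times, sizes, [0.0, 0.5, 1.0], 0.05)

# Finer grid and smaller threshold: every jump is the first one in its interval.
fine = first_jumps(times, sizes, [i / 8 for i in range(9)], 0.005)
```

On the coarse grid the first-jump sum at time 1 is 1.5, whereas the true sum of jumps is 1.31; on the finer grid with the smaller threshold the two agree, which is the behaviour that the convergence statement describes in the limit.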

Let $\overline{\Pi}_Z(z) = \Pi_Z\{[-z, z]^c\}$, $z > 0$, denote the tail of $\Pi_Z$ and assume $\overline{\Pi}_Z(0+) > 0$ to avoid 
trivialities. 

Proposition 5.1. Suppose $\lim_{n\to\infty} \sqrt{\Delta t(n)}\, \overline{\Pi}_Z(m_n^Z) = 0$. Then, (i) we have 

$$\sup_{0 \le t \le T} |\overline{Z}_n(t) - Z(t)| \xrightarrow{P} 0 \quad \text{as } n \to \infty. \tag{5.3}$$

If, in addition, $E|Z(1)| < \infty$ and $EZ(1) = 0$, we may replace $\gamma_Z - \int_{m_n^Z < |z| \le 1} z\, \Pi_Z(dz)$ by 
$-\int_{|z| > m_n^Z} z\, \Pi_Z(dz)$ in (5.2), and (5.3) remains true. 

If, further, we have $E(Z(1))^2 < \infty$, then the convergence in (5.3) is, in fact, in $L^2$, 
that is, $\lim_{n\to\infty} \|\overline{Z}_n(t) - Z(t)\|_2 = 0$. 

(ii) If $Z$ is of finite variation with jump component $Z^d(t) := \sum_{0 < s \le t} \Delta Z(s)$, then 

$$\sup_{0 \le t \le T} \Bigl| \sum_{i=1}^{N_n} \mathbf{1}_{\{\tau^Z_{i,n} \le t\}}\, \Delta Z(\tau^Z_{i,n}) - Z^d(t) \Bigr| \xrightarrow{P} 0 \quad \text{as } n \to \infty. \tag{5.4}$$

Proof. (i) The claimed results follow immediately from Theorem 2.1 of [25]. The setup 
there is identical, except that the discretization of the state space they allow for is not 
needed here. So, in their theorem, we formally set $M(n) = \infty$ and $\Delta(n) = 0$, and identify 
$L_n(t)$ in their notation with $\overline{Z}_n(t)$. Our equation (5.3) then follows from equation (2.11) 
of [25]. 

(ii) For Lévy processes of finite variation, truncation of the Lévy measure near 0 is 
not necessary and the truncation function $\mathbf{1}_{\{|z| \le 1\}}$ can be dropped from the formulation. 
The same holds for the approximation scheme. Thus, (5.4) follows from (5.3). □ 

Proof of Theorem 2.1. This proceeds in several steps. In parts (i)-(iii), the 
approximation procedures for $L(t)$, $\sigma^2(t)$ and $G(t)$ are outlined. The convergence, as stated in 
the theorem, is then shown in part (iv). 

Part (i): Approximation procedure for the underlying process $L(t)$ 

The approximation procedure requires two stages. On one hand, we need a discrete 
GARCH approximating process satisfying (2.3) and (2.4). This does not come directly 
from the kind of approximation used in Proposition 5.1, but rather from the process $L_n$ 
defined by 

$$L_n(t) := \sum_{i=1}^{N_n(t)} \sqrt{\Delta t_i(n)}\, \varepsilon_{i,n}, \quad 0 \le t \le T,\ n = 1, 2, \ldots. \tag{5.5}$$

Here, recall the $\varepsilon_{i,n}$ defined in (2.7). Recall, also, the first jump times $\tau_{i,n}$ defined in (2.5) 
and set $\tau^*_{i,n} = \tau_{i,n} \wedge t_i(n)$. Define the counting process 

$$N_n(t) := \#\{i \in \mathbb{N} : \tau^*_{i,n} \le t\}, \quad 0 \le t \le T, \quad \text{with } N_n(0) = 0.$$

$N_n(t)$ increases by 1 in each subinterval $(t_{i-1}(n), t_i(n)]$, $i = 1, 2, \ldots, N_n$, at the first time 
$\tau_{i,n}$ in the interval at which $L(t)$ changes in magnitude by more than $m_n$, or at $t_i(n)$, if 
there is no such change. Note that, finally, $N_n(t_{N_n(T)}(n)) = N_n(T) = N_n$. 

As an intermediate step, we also need the sequence of processes defined by 



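$$\overline{L}_n(t) = \sum_{i=1}^{N_n(t)} \mathbf{1}_{\{\tau_{i,n} < \infty\}}\, \Delta L(\tau_{i,n}) - t \int_{|x| > m_n} x\, \Pi(dx), \quad 0 \le t \le T, \tag{5.6}$$

to which we can apply Proposition 5.1. We have $E|L(1)| < \infty$, so, by Proposition 5.1, 
$\overline{L}_n$, as centered, converges in probability, uniformly on $[0,T]$, to $L$. Thus, to show that 
$L_n \to L$ in probability, uniformly on $[0,T]$, we need only control the uniform distance of 
$L_n$ from $\overline{L}_n$. 

To estimate this, write $\overline{L}_n$ in terms of $\varepsilon_{i,n}$ as 

$$\overline{L}_n(t) = \sum_{i=1}^{N_n(t)} \bigl(\varepsilon_{i,n}\, \xi_i(n) + \nu_i(n)\bigr) - t \int_{|x| > m_n} x\, \Pi(dx).$$

Here, 

$$\nu_i(n) := E\bigl(\mathbf{1}_{\{\tau_{i,n} < \infty\}}\, \Delta L(\tau_{i,n})\bigr) = \frac{1 - e^{-\Delta t_i(n) \overline{\Pi}(m_n)}}{\overline{\Pi}(m_n)} \int_{|x| > m_n} x\, \Pi(dx)$$

and 

$$\xi_i^2(n) := \operatorname{Var}\bigl(\mathbf{1}_{\{\tau_{i,n} < \infty\}}\, \Delta L(\tau_{i,n})\bigr) = \frac{1 - e^{-\Delta t_i(n) \overline{\Pi}(m_n)}}{\overline{\Pi}(m_n)} \int_{|x| > m_n} x^2\, \Pi(dx) - \nu_i^2(n)$$

are calculated from (2.6). Their asymptotic behaviors as $n \to \infty$ are 

$$\max_{i=1,\ldots,N_n} \frac{|\nu_i(n)|}{\sqrt{\Delta t_i(n)}} \to 0 \quad \text{and} \quad \max_{i=1,\ldots,N_n} \Bigl| \frac{\xi_i^2(n)}{\Delta t_i(n)} - 1 \Bigr| \to 0. \tag{5.7}$$

To see this, use the inequality $1 - e^{-x} \le x$, $x \ge 0$, and write 

$$\frac{|\nu_i(n)|}{\sqrt{\Delta t_i(n)}} \le \sqrt{\Delta t_i(n)}\, \Bigl| \int_{m_n < |x| \le 1} x\, \Pi(dx) + \int_{|x| > 1} x\, \Pi(dx) \Bigr|$$
$$\le O\bigl(\sqrt{\Delta t(n)}\bigr) \Bigl( \overline{\Pi}(m_n) + \Bigl| \int_{|x| > 1} x\, \Pi(dx) \Bigr| \Bigr)$$
$$= O\Bigl(\sqrt{\Delta t(n)\, \overline{\Pi}^2(m_n)}\Bigr) + O\bigl(\sqrt{\Delta t(n)}\bigr).$$

Since $\lim_{n\to\infty} \Delta t(n)\, \overline{\Pi}^2(m_n) = 0$ by assumption, we get the result in (5.7) for $\nu_i(n)$, and 
then the result for $\xi_i^2(n)$ holds since $\int x^2\, \Pi(dx) = \operatorname{Var}(L(1)) = 1$ by assumption. 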
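
The limits in (5.7) can be checked numerically for a concrete jump measure. The Python sketch below is only an illustration under stated assumptions: the helper `nu_xi2` and the two-point Lévy measure (mass 0.5 at $x = 1$ and mass 2 at $x = -0.5$, chosen so that $\int x^2\, \Pi(dx) = 1$) are hypothetical and not taken from the paper, and the threshold $m_n$ is assumed small enough that both atoms lie in $\{|x| > m_n\}$.

```python
import math

def nu_xi2(dt, tail_mass, int_x, int_x2):
    """Mean nu_i(n) and variance xi_i^2(n) of 1{tau < inf} * dL(tau) on an
    interval of length dt, given the tail mass Pi-bar(m_n) and the integrals
    of x and x^2 with respect to Pi over {|x| > m_n}."""
    factor = -math.expm1(-dt * tail_mass) / tail_mass   # (1 - exp(-dt*tail)) / tail
    nu = factor * int_x
    xi2 = factor * int_x2 - nu ** 2
    return nu, xi2

# Hypothetical two-point Levy measure: mass 0.5 at x = 1 and mass 2 at x = -0.5.
TAIL = 2.5
INT_X = 0.5 * 1.0 + 2.0 * (-0.5)     # integral of x   -> -0.5
INT_X2 = 0.5 * 1.0 + 2.0 * 0.25      # integral of x^2 ->  1.0

rows = []
for dt in (1e-2, 1e-3, 1e-4):
    nu, xi2 = nu_xi2(dt, TAIL, INT_X, INT_X2)
    # Columns: dt, |nu| / sqrt(dt), |xi^2 / dt - 1|; both ratios shrink with dt.
    rows.append((dt, abs(nu) / math.sqrt(dt), abs(xi2 / dt - 1.0)))
```

Because this measure has finite total mass, $\overline{\Pi}(m_n)$ stays bounded and the rate condition reduces to $\Delta t(n) \to 0$; for infinite activity the threshold $m_n$ must shrink slowly enough relative to the mesh.
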
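From (5.5) and (5.6), we have 

$$L_n(t) - \overline{L}_n(t) = \sum_{i=1}^{N_n(t)} \bigl(\sqrt{\Delta t_i(n)} - \xi_i(n)\bigr) \varepsilon_{i,n} - \sum_{i=1}^{N_n(t)} \nu_i(n) + t \int_{|x| > m_n} x\, \Pi(dx). \tag{5.8}$$

Write $\sum_{i=1}^{N_n(t)} \Delta t_i(n) = t - r_n(t)$, where $0 \le r_n(t) \le \Delta t(n)$ a.s., and use the inequality 
$0 \le x - 1 + e^{-x} \le x^2/2$, $x \ge 0$, and the assumption that $\lim_{n\to\infty} \Delta t(n)\, \overline{\Pi}^2(m_n) = 0$ to get 

$$\Bigl| \sum_{i=1}^{N_n(t)} \nu_i(n) - t \int_{|x| > m_n} x\, \Pi(dx) \Bigr|$$
$$= \Bigl| \frac{1}{\overline{\Pi}(m_n)} \sum_{i=1}^{N_n(t)} \bigl( \Delta t_i(n)\, \overline{\Pi}(m_n) - 1 + e^{-\Delta t_i(n) \overline{\Pi}(m_n)} \bigr) + r_n(t) \Bigr| \cdot \Bigl| \int_{|x| > m_n} x\, \Pi(dx) \Bigr|$$
$$\le \Bigl( O\bigl(\overline{\Pi}(m_n)\bigr) \sum_{i=1}^{N_n} (\Delta t_i(n))^2 + \Delta t(n) \Bigr) \Bigl| \int_{|x| > m_n} x\, \Pi(dx) \Bigr|$$
$$= \Bigl( O\bigl(\sqrt{\Delta t(n)}\bigr) \sqrt{\Delta t(n)\, \overline{\Pi}^2(m_n)} + \Delta t(n) \Bigr) \Bigl| \int_{|x| > m_n} x\, \Pi(dx) \Bigr|$$
$$= o\bigl(\sqrt{\Delta t(n)}\bigr) \Bigl| \int_{|x| > m_n} x\, \Pi(dx) \Bigr|.$$

Note that $\lim_{n\to\infty} \sqrt{\Delta t(n)}\, \bigl| \int_{|x| > m_n} x\, \Pi(dx) \bigr| = 0$ was shown in the proof of (5.7). 
Also, since the $(\varepsilon_{i,n})_{i=1,\ldots,N_n}$ are independent with means 0 and variances 1, and 
$\sum_{i=1}^{N_n(t)} \Delta t_i(n) \le T$, the variance of the first term on the right-hand side of (5.8) is not 
larger than 

$$\sum_{i=1}^{N_n} \bigl| \sqrt{\Delta t_i(n)} - \xi_i(n) \bigr|^2 \le T \max_{i=1,\ldots,N_n} \Bigl| 1 - \frac{\xi_i(n)}{\sqrt{\Delta t_i(n)}} \Bigr|^2 \to 0 \quad \text{as } n \to \infty \tag{5.9}$$

by (5.7). These arguments show that $\sup_{0 \le t \le T} |L_n(t) - \overline{L}_n(t)| \xrightarrow{P} 0$, as $n \to \infty$, as claimed, 
so we deduce from Proposition 5.1 the required convergence in probability, uniformly on 
$[0,T]$, of $L_n$ to $L$. 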

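Part (ii): Approximation procedure for the variance process $\sigma^2(t)$ 

Having defined the $\varepsilon_{i,n}$ in (2.7) and given the parameters $(\beta, \eta, \varphi)$, the variance process 
$\sigma^2_{i,n}$ is constructed using the GARCH(1, 1) equation (2.4). This can then be iterated (cf. 
[8, 21]) to get the explicit representation 

$$\sigma^2_{i,n} = \beta \sum_{j=1}^{i} \Delta t_j(n) \prod_{k=j+1}^{i} e^{-\eta \Delta t_k(n)} \bigl(1 + \varphi\, \Delta t_k(n)\, \varepsilon^2_{k,n}\bigr) + \sigma^2_{0,n} \prod_{j=1}^{i} e^{-\eta \Delta t_j(n)} \bigl(1 + \varphi\, \Delta t_j(n)\, \varepsilon^2_{j,n}\bigr) \tag{5.10}$$

for $i = 0, 1, \ldots, N_n$ (take $\sum_{j=1}^{0} = 0$ and $\prod_{j=i+1}^{i} = 1$). Define a discrete-time process 

$$X_{i,n} = \eta\, t_i(n) - \sum_{j=1}^{i} \log\bigl(1 + \varphi\, \Delta t_j(n)\, \varepsilon^2_{j,n}\bigr) \quad \text{for } n = 1, 2, \ldots, \tag{5.11}$$

then define its continuous-time counterpart by interpolation: 

$$X_n(t) := X_{N_n(t),n} = \eta\, t_{N_n(t)}(n) - \sum_{i=1}^{N_n(t)} \log\bigl(1 + \varphi\, \Delta t_i(n)\, \varepsilon^2_{i,n}\bigr), \quad 0 \le t \le T. \tag{5.12}$$

Note that $X_n(\tau^*_{i,n}) = X_{i,n}$. Again, we wish to use the convergence result in Proposition 5.1, 
so we specify an auxiliary version of $X$ as follows: 

$$\overline{X}_n(t) = \eta\, t_{N_n(t)}(n) - \sum_{i=1}^{N_n(t)} \log\bigl(1 + \varphi\, \mathbf{1}_{\{\tau_{i,n} < \infty\}} (\Delta L(\tau_{i,n}))^2\bigr), \quad 0 \le t \le T. \tag{5.13}$$

$\overline{X}_n$ is an approximation to $X$ as defined in (2.12), and $X$ is of finite variation, so, from 
Proposition 5.1, we have 

$$\sup_{0 \le t \le T} |X(t) - \overline{X}_n(t)| \xrightarrow{P} 0, \quad \text{as } n \to \infty. \tag{5.14}$$

To check this, just compare (2.12) and (5.13), note that $\lim_{n\to\infty} t_{N_n(t)}(n) = t$ and set 
$Z(t) = X(t)$, $\overline{Z}_n(t) = \overline{X}_n(t)$ and $m_n^Z = \log(1 + \varphi m_n^2)$ in (5.1). Then $\tau^Z_{i,n} = \tau_{i,n}$ and part (ii) 
of Proposition 5.1 gives (5.14). 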

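To establish the closeness of $X_n$ to $\overline{X}_n$, write 

$$\bigl| \log\bigl(1 + \varphi\, \Delta t_i(n)\, \varepsilon^2_{i,n}\bigr) - \log\bigl(1 + \varphi\, \mathbf{1}_{\{\tau_{i,n} < \infty\}} (\Delta L(\tau_{i,n}))^2\bigr) \bigr| / \varphi$$
$$\le \bigl| \Delta t_i(n)\, \varepsilon^2_{i,n} - \mathbf{1}_{\{\tau_{i,n} < \infty\}} (\Delta L(\tau_{i,n}))^2 \bigr| = \bigl| \Delta t_i(n)\, \varepsilon^2_{i,n} - (\varepsilon_{i,n}\, \xi_i(n) + \nu_i(n))^2 \bigr|$$
$$= \bigl| (\Delta t_i(n) - \xi_i^2(n))\, \varepsilon^2_{i,n} - 2\, \xi_i(n)\, \varepsilon_{i,n}\, \nu_i(n) - \nu_i^2(n) \bigr|. \tag{5.15}$$

A similar argument as in (5.9) shows that the right-hand side of (5.15), when summed 
over $1 \le i \le N_n(t)$, tends in probability, uniformly on $[0,T]$, to 0. Thus, 
$\sup_{0 \le t \le T} |X_n(t) - \overline{X}_n(t)| \xrightarrow{P} 0$ as $n \to \infty$ and, using the triangle inequality, we conclude from (5.14) that 

$$\sup_{0 \le t \le T} |X(t) - X_n(t)| \xrightarrow{P} 0 \quad \text{as } n \to \infty. \tag{5.16}$$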

Now we are in a position to show that an interpolated version of $\sigma^2_{i,n}$ approaches $\sigma^2(t)$, in 
the limit. Substituting in (5.10) for $X_{j,n}$ from (5.11), we can write, recalling $\sigma^2_{0,n} = \sigma^2(0)$, 

$$\sigma^2_{i,n} = \beta\, e^{-X_{i,n}} \sum_{j=1}^{i} \Delta t_j(n)\, e^{X_{j,n}} + \sigma^2(0)\, e^{-X_{i,n}}. \tag{5.17}$$

Define the piecewise constant process 

$$\tilde{\sigma}^2_n(t) := \beta\, e^{-X_n(t)} \sum_{i=1}^{N_n(t)} e^{X_n(\tau^*_{i,n})}\, \Delta t_i(n) + \sigma^2(0)\, e^{-X_n(t)}, \quad 0 \le t \le T. \tag{5.18}$$
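
The equivalence between the recursive and the explicit forms of the discrete variance can be verified numerically. The Python sketch below assumes that the GARCH(1, 1)-type recursion (2.4), which is not restated in this section, has the form $\sigma^2_{i,n} = \beta \Delta t_i(n) + (1 + \varphi \Delta t_i(n) \varepsilon^2_{i,n}) e^{-\eta \Delta t_i(n)} \sigma^2_{i-1,n}$, which is the form whose iteration yields (5.10). The function names, the Gaussian stand-in innovations, the spacings and the starting value are illustrative assumptions; in the paper the $\varepsilon_{i,n}$ are the standardized first-jump variables from (2.7).

```python
import math
import random

def garch_recursion(beta, eta, phi, sigma2_0, dts, eps):
    """Iterate sigma2_i = beta*dt_i + (1 + phi*dt_i*eps_i^2)*exp(-eta*dt_i)*sigma2_{i-1}."""
    path = [sigma2_0]
    for dt, e in zip(dts, eps):
        path.append(beta * dt + (1.0 + phi * dt * e * e) * math.exp(-eta * dt) * path[-1])
    return path

def explicit_form(beta, eta, phi, sigma2_0, dts, eps):
    """Evaluate (5.17) with X_i = eta*t_i - sum_{j<=i} log(1 + phi*dt_j*eps_j^2) as in (5.11)."""
    path, t, logsum, weighted = [sigma2_0], 0.0, 0.0, 0.0
    for dt, e in zip(dts, eps):
        t += dt
        logsum += math.log1p(phi * dt * e * e)
        x = eta * t - logsum                  # X_{i,n}
        weighted += dt * math.exp(x)          # running sum of dt_j * exp(X_{j,n})
        path.append(beta * math.exp(-x) * weighted + sigma2_0 * math.exp(-x))
    return path

rng = random.Random(0)
dts = [rng.uniform(0.5, 3.0) for _ in range(50)]   # irregular spacings (illustrative)
eps = [rng.gauss(0.0, 1.0) for _ in range(50)]     # stand-in innovations, mean 0, variance 1
args = (1.5, 0.085, 0.069, 2.0, dts, eps)          # beta, eta, phi; sigma2_0 = 2.0 is assumed
rec = garch_recursion(*args)
exp_path = explicit_form(*args)
```

The two paths agree up to floating-point error, which reflects the identity $e^{-(X_{i,n} - X_{i-1,n})} = (1 + \varphi \Delta t_i(n) \varepsilon^2_{i,n}) e^{-\eta \Delta t_i(n)}$ behind the substitution.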



Now, by (5.16), $e^{-X_n}$ converges in probability, uniformly on $[0,T]$, to $e^{-X}$. To deal with 
the summation in (5.18), note that, except possibly for the last interval, where $i = N_n(t)$, 
we have $X_n(\tau^*_{i,n}) = X_n(t_i(n))$ since $X_n$ can change value only at times $t = \tau^*_{i,n}$ and is 
constant elsewhere. Thus, 

$$\sup_{0 \le t \le T} \Bigl| \sum_{i=1}^{N_n(t)} \Delta t_i(n) \bigl( e^{X_n(\tau^*_{i,n})} - e^{X_n(t_i(n))} \bigr) \Bigr| \le 2 \Delta t(n) \sup_{0 \le t \le T} e^{X_n(t)} \le 2 e^{\eta T} \Delta t(n) \to 0$$

(note that $X_n(t)$ is bounded above by $\eta T$, as is $X(t)$, for $0 \le t \le T$). Now, estimate 

$$\sup_{0 \le t \le T} \Bigl| \sum_{i=1}^{N_n(t)} \Delta t_i(n) \bigl( e^{X_n(t_i(n))} - e^{X(t_i(n))} \bigr) \Bigr| \le \sum_{i=1}^{N_n(T)} e^{X(t_i(n))}\, \Delta t_i(n)\, \bigl| 1 - e^{X_n(t_i(n)) - X(t_i(n))} \bigr|$$
$$\le T e^{\eta T} \sup_{0 \le s \le T} \bigl| 1 - e^{X_n(s) - X(s)} \bigr|.$$

By (5.16), the last expression tends to 0 in probability. Finally, note that the discretely 
formed integral $\sum_{i=1}^{N_n(t)} e^{X(t_i(n))} \Delta t_i(n)$ converges in probability, uniformly on $[0,T]$, to 
the integral $\int_0^t e^{X(s)}\, ds$ by Theorem 21, Chapter II, of [22]. Hence, we deduce 

$$\sup_{0 \le t \le T} \Bigl| \sum_{i=1}^{N_n(t)} e^{X_n(\tau^*_{i,n})}\, \Delta t_i(n) - \int_0^t e^{X(s)}\, ds \Bigr| \xrightarrow{P} 0.$$

From (2.13) and (5.18), we now conclude that 

$$\tilde{\sigma}^2_n(t) \xrightarrow{P} \beta\, e^{-X(t)} \int_0^t e^{X(s)}\, ds + \sigma^2(0)\, e^{-X(t)} = \sigma^2(t), \tag{5.19}$$

uniformly for $0 \le t \le T$. 

Part (iii): Approximation procedure for the COGARCH process $G(t)$ 

In this section, we define a discrete integrated GARCH sequence $G_n$ and prove its 
convergence to the continuous-time COGARCH process $G$. We take $G_{i,n}$ as in (2.3), thus 

$$G_{i,n} = \sum_{j=1}^{i} \sigma_{j-1,n} \sqrt{\Delta t_j(n)}\, \varepsilon_{j,n}, \quad i = 1, \ldots, N_n,$$

with $\varepsilon_{j,n}$ and $\sigma^2_{j,n}$ satisfying (2.7) and (5.10). Interpolate to get a continuous-time version: 

$$\widetilde{G}_n(t) = \sum_{i=1}^{N_n(t)} \sigma_{i-1,n} \sqrt{\Delta t_i(n)}\, \varepsilon_{i,n}, \quad 0 \le t \le T. \tag{5.20}$$
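
The discrete sequence $G_{i,n}$ can be generated alongside the variance in a single pass over irregularly spaced observation times. The Python sketch below is a hedged illustration: the function name `cogarch_discrete`, the spacings, the innovations and the starting variance are hypothetical, and the variance update is the assumed form of the GARCH(1, 1)-type equation (2.4), namely $\sigma^2_{i,n} = \beta \Delta t_i(n) + (1 + \varphi \Delta t_i(n) \varepsilon^2_{i,n}) e^{-\eta \Delta t_i(n)} \sigma^2_{i-1,n}$.

```python
import math

def cogarch_discrete(beta, eta, phi, sigma2_0, dts, eps):
    """Return (G, sigma2) with G[0] = 0, where
    G[i] = G[i-1] + sigma[i-1] * sqrt(dt_i) * eps_i, and sigma2 follows the
    assumed GARCH(1,1)-type recursion with spacing-dependent coefficients."""
    G, sigma2 = [0.0], [sigma2_0]
    for dt, e in zip(dts, eps):
        # The increment uses the variance from the previous step (predictable).
        G.append(G[-1] + math.sqrt(sigma2[-1] * dt) * e)
        sigma2.append(beta * dt + (1.0 + phi * dt * e * e) * math.exp(-eta * dt) * sigma2[-1])
    return G, sigma2

dts = [1.0, 3.0, 1.0, 2.0]     # irregular spacings, e.g. in days (illustrative)
eps = [0.5, -1.0, 0.0, 2.0]    # hypothetical innovations
G, sigma2 = cogarch_discrete(1.5, 0.085, 0.069, 1.0, dts, eps)
```

Each increment of $G$ is scaled by the variance carried over from the previous observation time, so the spacing $\Delta t_i(n)$ enters both the increment and the variance update.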

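By the definitions of $\tilde{\sigma}_n$ and $L_n$ in (5.5) and (5.18), we can write 

$$\widetilde{G}_n(t) = \sum_{i=1}^{N_n(t)} \tilde{\sigma}_n(\tau^*_{i-1,n}) \sqrt{\Delta t_i(n)}\, \varepsilon_{i,n} = \int_0^t \tilde{\sigma}_n(s-)\, dL_n(s), \quad 0 \le t \le T,$$

so it is plausible that $\widetilde{G}_n(t) \to G(t) = \int_0^t \sigma(s-)\, dL(s)$, uniformly for $t \in [0,T]$. We confirm 
this as follows: 

$$\widetilde{G}_n(t) = \sum_{i=1}^{N_n(t)} \bigl[\tilde{\sigma}_n(\tau^*_{i-1,n}) - \sigma(\tau^*_{i-1,n})\bigr] \sqrt{\Delta t_i(n)}\, \varepsilon_{i,n} + \sum_{i=1}^{N_n(t)} \sigma(\tau^*_{i-1,n})\, DL_n(\tau^*_{i,n})$$
$$= \sum_{i=1}^{N_n(t)} \bigl[\tilde{\sigma}_n(\tau^*_{i-1,n}) - \sigma(\tau^*_{i-1,n})\bigr] \sqrt{\Delta t_i(n)}\, \varepsilon_{i,n}$$
$$\quad + \sum_{i=1}^{N_n(t)} \sigma(\tau^*_{i-1,n}) \bigl( DL_n(\tau^*_{i,n}) - D\overline{L}_n(\tau^*_{i,n}) \bigr) + \sum_{i=1}^{N_n(t)} \sigma(\tau^*_{i-1,n})\, D\overline{L}_n(\tau^*_{i,n}),$$

where $DL_n(\tau^*_{i,n}) := L_n(\tau^*_{i,n}) - L_n(\tau^*_{i-1,n})$ and $D\overline{L}_n(\tau^*_{i,n}) := \overline{L}_n(\tau^*_{i,n}) - \overline{L}_n(\tau^*_{i-1,n})$ for 
$i = 1, 2, \ldots, N_n$. Write the last expression as 

$$\widetilde{G}_n(t) = M_{N_n(t),n} + Q_{N_n(t),n} + R_{N_n(t),n}, \tag{5.21}$$

where 

$$M_{i,n} = \sum_{k=1}^{i} \bigl[\tilde{\sigma}_n(\tau^*_{k-1,n}) - \sigma(\tau^*_{k-1,n})\bigr] \sqrt{\Delta t_k(n)}\, \varepsilon_{k,n} = \sum_{k=1}^{i} a_{k-1,n} \sqrt{\Delta t_k(n)}\, \varepsilon_{k,n}$$

and 

$$Q_{i,n} = \sum_{k=1}^{i} \sigma(\tau^*_{k-1,n}) \bigl( DL_n(\tau^*_{k,n}) - D\overline{L}_n(\tau^*_{k,n}) \bigr) = \sum_{k=1}^{i} \sigma(\tau^*_{k-1,n})\, D_{k,n}, \quad \text{say}.$$

First, we show that $M_{i,n}$ is uniformly asymptotically negligible. We plan to use Markov's 
and Doob's inequalities, but $\sigma^2(t)$ does not necessarily have a finite expectation under 
our assumptions, so we need a truncation argument. For $v, C > 0$, write 

$$P\Bigl( \max_{i=0,\ldots,N_n} |M_{i,n}| > v \Bigr) \le P\Bigl( \max_{i=0,\ldots,N_n} |M_{i,n}| > v,\ \sup_{0 \le t \le T} |\tilde{\sigma}_n(t) - \sigma(t)| \le C \Bigr) + P\Bigl( \sup_{0 \le t \le T} |\tilde{\sigma}_n(t) - \sigma(t)| > C \Bigr).$$

The second term on the right tends to 0 as $n \to \infty$ by (5.19). The first term on the right 
is bounded by 

$$P\Bigl( \max_{i=1,\ldots,N_n} \Bigl| \sum_{k=1}^{i} a_{k-1,n}\, \mathbf{1}_{\{|a_{k-1,n}| \le C\}} \sqrt{\Delta t_k(n)}\, \varepsilon_{k,n} \Bigr| > v \Bigr) =: P\Bigl( \max_{i=1,\ldots,N_n} |M^C_{i,n}| > v \Bigr), \quad \text{say}.$$

For each $n \ge 1$, $(M^C_{i,n}, \mathcal{F}_{\tau^*_{i,n}})_{i=0,\ldots,N_n}$ is a martingale. Use Markov's inequality and Doob's 
maximal quadratic inequality to obtain 

$$P\Bigl( \max_{i=1,\ldots,N_n} |M^C_{i,n}| > v \Bigr) \le \frac{1}{v^2} E\Bigl( \max_{i=1,\ldots,N_n} (M^C_{i,n})^2 \Bigr) \le \frac{4}{v^2} \sum_{k=1}^{N_n} E\bigl( a^2_{k-1,n}\, \mathbf{1}_{\{|a_{k-1,n}| \le C\}} \bigr) \Delta t_k(n)$$
$$\le \frac{4T}{v^2} E\Bigl( \min\Bigl\{ \sup_{0 \le t \le T} |\tilde{\sigma}_n(t) - \sigma(t)|^2,\ C^2 \Bigr\} \Bigr).$$

By (5.19) and the dominated convergence theorem, the above expression tends to 0 as 
$n \to \infty$. Hence, $\max_{i=1,\ldots,N_n} |M_{i,n}| \xrightarrow{P} 0$ as $n \to \infty$. 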

Next, we deal with $Q_{i,n}$. From (2.12), we have $X(s) - X(t) \le \sum_{s < u \le t} \log(1 + \varphi (\Delta L(u))^2)$ 
when $0 \le s \le t$, so, from (2.13), we get 

$$E\Bigl( \sup_{0 \le t \le T} \sigma^2(t) \Bigr) \le \bigl( \beta/\varphi + E\sigma^2(0) \bigr) e^{\varphi T} =: C^*.$$

Further, for each $n \ge 1$, $(Q_{i,n}, \mathcal{F}_{\tau^*_{i,n}})_{i=0,\ldots,N_n}$ is a martingale, and we can use a similar 
argument as for $M_{i,n}$. Chebyshev's inequality and Doob's maximal quadratic inequality 
give 

$$P\Bigl( \max_{i=1,\ldots,N_n} |Q_{i,n}| > v \Bigr) \le \frac{4}{v^2} E\Bigl( \sum_{k=1}^{N_n} \sigma(\tau^*_{k-1,n})\, D_{k,n} \Bigr)^2.$$

An upper bound for this is 

$$\frac{4}{v^2} E\Bigl( \sup_{0 \le t \le T} \sigma^2(t) \Bigr) \operatorname{Var}\Bigl( \sum_{k=1}^{N_n} D_{k,n} \Bigr) \le \frac{4C^*}{v^2} \operatorname{Var}\bigl( L_n(\tau^*_{N_n,n}) - \overline{L}_n(\tau^*_{N_n,n}) \bigr) \le \frac{4C^*}{v^2} \sup_{0 \le t \le T} E|L_n(t) - \overline{L}_n(t)|^2.$$

From (5.8), we can readily obtain that $\sup_{0 \le t \le T} E|L_n(t) - \overline{L}_n(t)|^2 \to 0$ as $n \to \infty$. Also, 
from Proposition 5.1, $\sup_{0 \le t \le T} E|\overline{L}_n(t) - L(t)|^2 \to 0$ as $n \to \infty$. So, we have shown that 
the first and second summands in (5.21) are $o_P(1)$ as $n \to \infty$. 

The third summand in (5.21), $R_{N_n(t),n}$, is a discrete stochastic integral with random 
partition $(\tau^*_{i,n})_{i=0,\ldots,N_n}$, where the mesh of the partition is bounded by $2\Delta t(n)$ and 
therefore tends to 0 a.s. Hence, Theorem 21, Chapter II, in [22] can be applied to show 
that this expression converges in probability, uniformly on $[0,T]$, to the stochastic integral 
$\int_0^t \sigma(s-)\, dL(s)$. So, finally, 

$$\sup_{0 \le t \le T} |\widetilde{G}_n(t) - G(t)| \xrightarrow{P} 0, \quad \text{as } n \to \infty. \tag{5.22}$$

Part (iv): Convergence of the Skorokhod distance 

Finally, we have to transfer from the tilde processes $\tilde{\sigma}^2_n(t)$ and $\widetilde{G}_n(t)$ to the desired 
approximating processes in (2.9). $\tilde{\sigma}_n(t)$ and $\widetilde{G}_n(t)$ are constant between jump times 
$\tau^*_{i,n} = \tau_{i,n} \wedge t_i(n)$, for $n \ge 1$, so we can write for $0 \le t \le T$, 

$$\sigma^2_n(t_i(n)) = \tilde{\sigma}^2_n(\tau^*_{i,n}) \quad \text{and} \quad G_n(t_i(n)) = \widetilde{G}_n(\tau^*_{i,n}).$$

To obtain the convergence of $(G_n, \sigma^2_n)$ to $(G, \sigma^2)$ in the Skorokhod distance, it is crucial 
to note that both processes $\tilde{\sigma}^2_n$ and $\widetilde{G}_n$ jump simultaneously and at most once in every 
interval $(t_{i-1}(n), t_i(n)]$ for $i = 1, \ldots, N_n$. The time change $\lambda(t)$ required in the Skorokhod 
distance can thus be specified pathwise as follows. On the grid $(t_i(n))_{i=1,\ldots,N_n-1}$, define 

$$\lambda_n(t_i(n); \omega) = \tau^*_{i,n}(\omega) = \tau_{i,n}(\omega) \wedge t_i(n) \quad \text{for } i = 1, \ldots, N_n - 1,$$

with $\lambda_n(0; \omega) = 0 = t_0(n)$ and $\lambda_n(T; \omega) = T = t_{N_n}(n)$, and interpolate piecewise linearly 
(hence continuously) between these points, thus obtaining a function $\lambda_n(\cdot\,; \omega)$ in $\Lambda$. By 
this construction, we see that 

$$\sup_{0 \le t \le T} |\lambda_n(t; \omega) - t| \le \Delta t(n).$$

With the specification $\lambda_n(T; \omega) = T$ at the endpoint, required for $\lambda \in \Lambda$, we ignore any 
jump in the last subinterval $(t_{N_n-1}, T]$. However, the event $A_n = \{\tau_{N_n,n} \le T\}$ has 
probability bounded by $\Delta t_{N_n}(n)\, \overline{\Pi}(m_n) = o(\sqrt{\Delta t(n)}) \to 0$ as $n \to \infty$, thus this modification 
is asymptotically negligible. 
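
The pathwise time change can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name `time_change`, the partition and the capped first-jump times are hypothetical values chosen so that each $\tau^*_{i,n}$ lies in $(t_{i-1}(n), t_i(n)]$, and the check of the bound is done on a finite set of evaluation points only.

```python
def time_change(grid, tau_star):
    """Piecewise-linear lambda_n with lambda_n(t_i) = tau*_i for i = 1..N-1,
    lambda_n(0) = 0 and lambda_n(T) = T.

    grid     = [t_0, ..., t_N]
    tau_star = [tau*_1, ..., tau*_{N-1}], with t_{i-1} < tau*_i <= t_i.
    """
    knots_x = list(grid)
    knots_y = [grid[0]] + list(tau_star) + [grid[-1]]

    def lam(t):
        for i in range(1, len(knots_x)):
            if t <= knots_x[i]:
                x0, x1 = knots_x[i - 1], knots_x[i]
                y0, y1 = knots_y[i - 1], knots_y[i]
                return y0 + (y1 - y0) * (t - x0) / (x1 - x0)
        raise ValueError("t outside [0, T]")

    return lam

grid = [0.0, 1.0, 3.0, 4.0, 6.0]   # irregular partition with maximal step 2 (illustrative)
tau_star = [0.4, 3.0, 3.5]         # hypothetical first-jump times, capped at t_i
lam = time_change(grid, tau_star)

# Largest deviation of lambda_n from the identity on a fine evaluation grid.
max_dev = max(abs(lam(k / 100) - k / 100) for k in range(601))
```

Since $\lambda_n(t) - t$ is piecewise linear, its largest absolute value is attained at a grid point, where it equals $t_i(n) - \tau^*_{i,n} \le \Delta t_i(n)$; in this example the maximum is 0.6, well below the maximal step of 2.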

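The definition of $\lambda_n(\cdot\,; \omega)$ allows us to write, on $A_n^c$, 

$$\sigma^2_n(t) = \tilde{\sigma}^2_n(\lambda_n(t; \omega)) \quad \text{and} \quad G_n(t) = \widetilde{G}_n(\lambda_n(t; \omega)) \quad \text{for } 0 \le t \le T.$$

This implies 

$$\sup_{0 \le t \le T} |\sigma^2_n(t) - \sigma^2(\lambda_n(t))| = \sup_{0 \le t \le T} |\tilde{\sigma}^2_n(\lambda_n(t)) - \sigma^2(\lambda_n(t))| = \sup_{0 \le t \le T} |\tilde{\sigma}^2_n(t) - \sigma^2(t)|$$

and 

$$\sup_{0 \le t \le T} |G_n(t) - G(\lambda_n(t))| = \sup_{0 \le t \le T} |\widetilde{G}_n(\lambda_n(t)) - G(\lambda_n(t))| = \sup_{0 \le t \le T} |\widetilde{G}_n(t) - G(t)|.$$

Therefore, we can bound the Skorokhod distance by 

$$\rho\bigl( (G_n, \sigma^2_n), (G, \sigma^2) \bigr) \le \sup_{0 \le t \le T} |\widetilde{G}_n(t) - G(t)| + \sup_{0 \le t \le T} |\tilde{\sigma}^2_n(t) - \sigma^2(t)| + \Delta t(n)$$

and this expression tends to 0 in probability by (5.19) and (5.22), finishing the proof. □ 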